Abstract
Trading consists of buying and selling assets in the market with the objective of obtaining profits. Strategies such as technical analysis and fundamental analysis have been created and implemented to obtain the highest possible profit. This project aims to take technical analysis (TA) beyond its traditional use by applying programming techniques, backtesting, simulation and optimization, while keeping the fundamentals of TA, i.e., using some traditional technical analysis tools. We work with three of them: the Stochastic RSI (SRSI), the Exponential Moving Average (EMA) and the Average True Range (ATR).
The strategy is very simple and, because we are working from a market microstructure perspective, we use time intervals shorter than one hour.
A buy signal is given when 2 conditions are met:
The SRSI 'K' parameter is above the 'D' parameter at the current time and at the previous time, i.e.: $K_i > D_i \ \& \ K_{i-1} > D_{i-1}$.
Three Exponential Moving Averages ($ema_1, ema_2, ema_3$) of 3 different lengths are used, where $len(ema_1) < len(ema_2) \ \ \& \ \ len(ema_2) < len(ema_3)$, and the shorter EMAs are above the longer ones: $ema_1 > ema_2 \ \ \& \ \ ema_2 > ema_3$.
These metrics are optimized over a training period, from 1 January 2018 to 1 January 2019, which yields the best parameters. A test dataset is then simulated with these parameters; the test period runs from 1 February 2019 to 1 February 2020. Another parameter for the optimization is a function that weights the Sharpe ratio and ...
The hypothesis is that, by combining these metrics with backtesting and optimization, we will generate a profit that, at the end date, is better than that of a traditional, empirical trading strategy.
To run this notebook, the packages listed in the requirements.txt file must be installed.
The notebook also depends on the project modules imported below (data, main, functions and visualizations):
%%capture
# Install all the pip packages in the requirements.txt
import sys
!{sys.executable} -m pip install -r requirements.txt
import data as dt
import main
import functions as fn
import visualizations as vz
from IPython.display import display, Image
import plotly.io as pio
pio.renderers.default='notebook'
The main dataset is a BTCUSDT.csv file containing Bitcoin-Tether data from 1 January 2018 to 1 February 2020. It has the following structure:
# the first 3 rows:
display(dt.BTCUSDT.head(3))
| Open Time | Open | High | Low | Close | Volume | K | D | EMA_8 | EMA_14 | EMA_40 | ATR | KD_Cross | TP | SL | Buy_signal | Sell_signal | Outcome |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2018-01-01 09:45:00 | 13599.99 | 13670.00 | 13571.33 | 13616.99 | 51.550990 | 0.396432 | 0.353634 | 13587.796139 | 13589.883255 | 13529.312000 | 136.981575 | False | 13760.820654 | 13480.8201 | 0 | 0 | NaN |
| 2018-01-01 10:00:00 | 13632.00 | 13657.92 | 13540.33 | 13550.00 | 50.432284 | 0.392360 | 0.387438 | 13579.396997 | 13584.565488 | 13530.321171 | 135.521109 | False | 13692.297165 | 13414.5000 | 0 | 0 | NaN |
| 2018-01-01 10:15:00 | 13550.00 | 13560.98 | 13497.98 | 13549.03 | 47.520692 | 0.348364 | 0.379052 | 13572.648776 | 13579.827423 | 13531.233797 | 130.080363 | False | 13685.614381 | 13413.5397 | 0 | 0 | NaN |
1- Initial Capital: $100,000 \ \text{USD}$.
2- Maximum risk per trade: $1,000 \ \text{USD}$. We will cover this using a fraction of the capital per trade.
3- Divide the data into train and test sets.
train: Jan/01/2018 - Jan/01/2019
test: Feb/01/2019 - Feb/01/2020
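The split above can be sketched with label-based slicing on the datetime index. This is only a minimal illustration on a synthetic 15-minute series: the `prices` frame and its values are hypothetical stand-ins for the BTCUSDT data, not the project's own loader.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-minute price series standing in for BTCUSDT.csv.
index = pd.date_range("2018-01-01", "2020-02-01", freq="15min", name="Open Time")
prices = pd.DataFrame({"Close": np.linspace(13000.0, 9000.0, len(index))}, index=index)

# Label-based slicing on a DatetimeIndex includes both end dates.
train = prices.loc["2018-01-01":"2019-01-01"]
test = prices.loc["2019-02-01":"2020-02-01"]

print("train:", train.index.min(), "->", train.index.max())
print("test: ", test.index.min(), "->", test.index.max())
```

With these dates the bars between 2 January 2019 and 31 January 2019 fall in neither set, so the two periods cannot overlap.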
1. Data usage criteria.
Instrument: BTCUSDT, Cryptocurrency.
Time interval: 3, 15, 30 minutes.
Data structure: Open, High, Low, Close, Volume data.
2. Signal generation criteria.
Buy signals are generated when the SRSI 'K' parameter crosses above the 'D' parameter AND the shorter-length EMAs are above the longer-length EMAs, i.e.:
$ema_1 > ema_2 \ \ \& \ \ ema_2 > ema_3$.
Sell signals are generated on a 'D'/'K' cross and when a Take Profit or Stop Loss level is reached.
3. Take Profit/Stop Loss criteria.
These levels are set with a scalar, but they change on every bar because they depend on the price, the Average True Range (ATR) and that scalar.
Take Profit Criteria:
price + atr*scalar
Stop Loss Criteria:
price - atr*scalar
4. Position sizing criteria.
The position size is also a scalar, kept below 10% of total capital, and it is going to be optimized.
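The Take Profit, Stop Loss and position sizing criteria above can be sketched as follows. The function names and the scalar values are hypothetical. A scalar of 1.05 is used only because it reproduces the TP value in the first sample row of the dataset; the SL column in that sample is not reproduced by this scalar, so treat the stop-loss value as an assumption.

```python
def exit_levels(price, atr, tp_scalar, sl_scalar):
    """Return (take_profit, stop_loss) as the price plus/minus an ATR multiple."""
    take_profit = price + atr * tp_scalar
    stop_loss = price - atr * sl_scalar
    return take_profit, stop_loss


def position_value(capital, fraction, max_fraction=0.10):
    """Cash committed to one trade; the fraction must stay below max_fraction."""
    if not 0 < fraction < max_fraction:
        raise ValueError("fraction must be greater than 0 and lower than max_fraction")
    return capital * fraction


# Close and ATR taken from the first sample row (2018-01-01 09:45:00).
tp, sl = exit_levels(price=13616.99, atr=136.981575, tp_scalar=1.05, sl_scalar=1.05)
print(round(tp, 2), round(sl, 2))

# Hypothetical 5% of the 100,000 USD initial capital.
print(position_value(capital=100_000, fraction=0.05))
```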
vz.strategy_test_viz(main.test, main.emas, True)
Below is a comparative table of the three performance attribution metrics (MAD) analyzed for this trading strategy, computed from the evolution of accumulated capital in both the training period and the test period. The metrics are the Sharpe ratio, the Sortino ratio and the Calmar ratio:
Sharpe Ratio: It is calculated by subtracting the risk-free rate from the average of the logarithmic returns of the movements and dividing the result by the standard deviation of those log returns. In general, the higher the Sharpe ratio, the more attractive the risk-adjusted return.
Sortino Ratio: It is a variation of the Sharpe ratio that differentiates harmful volatility from total volatility by using the standard deviation of negative returns (the downside deviation) instead of the total standard deviation of returns. The Sortino ratio takes the return of an asset or portfolio, subtracts the risk-free rate, and divides that amount by the downside deviation.
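The two definitions above can be sketched directly from an equity curve. This is a minimal illustration, not the code behind `main.MAD`: the function names and the sample equity values are hypothetical, the risk-free rate is expressed per period, and the downside deviation is taken as the standard deviation of the negative log returns, as described above.

```python
import numpy as np
import pandas as pd


def log_returns(equity: pd.Series) -> pd.Series:
    """Logarithmic returns between consecutive equity values."""
    return np.log(equity / equity.shift(1)).dropna()


def sharpe_ratio(equity: pd.Series, rf: float = 0.0) -> float:
    """(mean log return - risk-free rate) / std of log returns."""
    r = log_returns(equity)
    return (r.mean() - rf) / r.std()


def sortino_ratio(equity: pd.Series, rf: float = 0.0) -> float:
    """(mean log return - risk-free rate) / std of the negative log returns."""
    r = log_returns(equity)
    downside = r[r < 0]
    return (r.mean() - rf) / downside.std()


# Hypothetical equity curve starting from the 100,000 USD initial capital.
equity = pd.Series([100_000, 101_000, 100_500, 102_000, 101_200, 103_000], dtype=float)
print("Sharpe:", sharpe_ratio(equity))
print("Sortino:", sortino_ratio(equity))
```

Because only the losing periods enter the denominator, the Sortino ratio of this sample is higher than its Sharpe ratio.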
main.MAD
| | MAD | Train | Test |
|---|---|---|---|
| 0 | Sharpe Ratio | 0.0 | 0.0 |
| 1 | Sortino Ratio | 0.0 | 0.0 |
| 2 | Calmar Ratio | 0.0 | 0.0 |
Sharpe Ratio:
If the Sharpe ratio is negative, the performance of the movements is lower than the risk-free return. A Sharpe ratio below one means that the return on the assets is small relative to the risk assumed when investing in them.
Sortino Ratio:
The higher the Sortino ratio, the better the rating of the movements, since it means they are operated efficiently and without taking unnecessary risks that are not rewarded with higher returns. A low or negative Sortino ratio indicates that the investor is not being rewarded for the risks taken.
What is SRSI? The Stochastic RSI (StochRSI) is a technical indicator used to measure the strength and weakness of the Relative Strength Index (RSI) over a set period of time.
How to get the SRSI? For this metric we used this reference https://www.investopedia.com/terms/s/stochrsi.asp as well as a library for technical analysis https://technical-analysis-library-in-python.readthedocs.io/en/latest/ta.html, resulting in this function:
help(fn.stochrsi_k)
Help on function stochrsi_k in module functions:
stochrsi_k(close: pandas.core.series.Series, window: int = 14, smooth1: int = 3, smooth2: int = 3, fillna: bool = False) -> pandas.core.series.Series
Stochastic Relative Strenght Index K (SRSId)
The SRSI takes advantage of both momentum indicators in order to create a more
sensitive indicator that is attuned to a specific security's historical performance
rather than a generalized analysis of price change.
Args:
close(pandas.Series): dataset 'Close' column.
window(int): n period
smooth1(int): moving average of Stochastic RSI
smooth2(int): moving average of %K
Returns:
pandas.Series: New feature generated.
References:
[1] https://www.investopedia.com/terms/s/stochrsi.asp
What is EMA? It is a technical indicator that shows how the price of an asset changes over a certain period of time. The EMA is different from a simple moving average in that it places more weight on recent data points.
How to get the EMA? For this metric we used the following references:
[1] https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages
and the library pandas_ta resulting in this function:
help(fn.ema)
Help on function ema in module functions:
ema(close, length=None, talib=None, offset=None, **kwargs)
Exponential Moving Average (EMA)
The Exponential Moving Average is more responsive moving average compared to the
Simple Moving Average (SMA). The weights are determined by alpha which is
proportional to it's length. There are several different methods of calculating
EMA. One method uses just the standard definition of EMA and another uses the
SMA to generate the initial value for the rest of the calculation.
Args:
close (pd.Series): Series of 'close's
length (int): It's period. Default: 10
talib (bool): If TA Lib is installed and talib is True, Returns the TA Lib
version. Default: True
offset (int): How many periods to offset the result. Default: 0
Returns:
pd.Series
References:
[1] https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages
[2] https://www.investopedia.com/ask/answers/122314/what-exponential-moving-average-ema-formula-and-how-ema-calculated.asp
What is ATR? The Average True Range is a tool used in technical analysis to measure volatility.
How to get the ATR? To get this metric we consulted https://www.tradingview.com/wiki/Average_True_Range_(ATR) and this library: pandas_ta
and the function we got is this:
help(fn.atr)
Help on function atr in module functions:
atr(high, low, close, length=None, mamode=None, talib=None, drift=None, offset=None, **kwargs)
Average True Range (ATR)
Averge True Range is used to measure volatility, especially volatility caused by
gaps or limit moves.
Args:
high (pd.Series): Series of 'high's
low (pd.Series): Series of 'low's
close (pd.Series): Series of 'close's
length (int): It's period. Default: 14
mamode (str): See ```help(ta.ma)```. Default: 'rma'
talib (bool): If TA Lib is installed and talib is True, Returns the TA Lib
version. Default: True
drift (int): The difference period. Default: 1
offset (int): How many periods to offset the result. Default: 0
Returns:
pd.Series: New feature generated.
References:
https://www.tradingview.com/wiki/Average_True_Range_(ATR)
These 3 technical analysis tools were used in order to generate Buy or Sell signals:
SRSI
EMA
ATR
They have to meet the following conditions:
SRSI (K) needs to be higher than SRSI (D), but only when making a cross or break, AND:
Close > EMA8 > EMA14 > EMA40.
Using the dataframe below, we verify each of the signals and conditions.
x = main.test[['High', 'Low', 'Close', 'K', 'D', 'EMA_8', 'EMA_14', 'EMA_40', 'ATR', 'KD_Cross', 'TP', 'SL', 'Buy_signal', 'Sell_signal', 'Outcome', 'buys', 'sells']].head()
x
| Open Time | High | Low | Close | K | D | EMA_8 | EMA_14 | EMA_40 | ATR | KD_Cross | TP | SL | Buy_signal | Sell_signal | Outcome | buys | sells |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2018-01-02 00:00:00 | 13539.54 | 13382.16 | 13490.42 | 0.181130 | 0.075330 | 13448.718135 | 13434.082810 | 13354.042734 | 124.004615 | True | 13620.624845 | 13355.5158 | 1 | 0 | NaN | 13490.42 | NaN |
| 2018-01-02 00:15:00 | 13700.04 | 13467.22 | 13700.04 | 0.492088 | 0.231864 | 13504.567439 | 13469.543768 | 13370.920649 | 131.783017 | False | 13838.412168 | 13563.0396 | 0 | 1 | TP | NaN | 13700.04 |
| 2018-01-02 00:30:00 | 13850.00 | 13689.71 | 13715.00 | 0.825422 | 0.499547 | 13551.330230 | 13502.271266 | 13387.705008 | 133.820659 | False | 13855.511692 | 13577.8500 | 0 | 0 | NaN | NaN | NaN |
| 2018-01-02 00:45:00 | 13770.00 | 13672.89 | 13750.01 | 1.000000 | 0.772503 | 13595.481290 | 13535.303097 | 13405.378422 | 131.196760 | False | 13887.766598 | 13612.5099 | 0 | 0 | NaN | NaN | NaN |
| 2018-01-02 01:00:00 | 13800.00 | 13659.45 | 13662.13 | 0.875926 | 0.900449 | 13610.292114 | 13552.213351 | 13417.902889 | 131.865253 | False | 13800.588516 | 13525.5087 | 0 | 0 | NaN | NaN | NaN |
kd_cross = x['K'].iloc[0] > x['D'].iloc[0]
print('K/D Cross is:', kd_cross)
K/D Cross is: True
EMA_cond = (x['Close'].iloc[0] > x['EMA_8'].iloc[0]) & (x['EMA_8'].iloc[0] > x['EMA_14'].iloc[0]) & (x['EMA_14'].iloc[0] > x['EMA_40'].iloc[0])
print('EMA Condition is:', EMA_cond)
EMA Condition is: True
When both conditions are true, the Boolean Buy signal is True; otherwise it is False.
print('Buy signal is:', kd_cross and EMA_cond)
Buy signal is: True
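The single-row check above can be extended to the whole frame in a vectorized way. The sketch below is an assumption about how such a column could be built, not the project's own implementation: the function name `buy_signals` and the sample values are hypothetical, and a cross is taken to mean that K is above D on the current bar but was not on the previous one.

```python
import pandas as pd


def buy_signals(df: pd.DataFrame) -> pd.Series:
    """Return 1 where K crosses above D and Close > EMA_8 > EMA_14 > EMA_40, else 0."""
    k_above_d = df["K"] > df["D"]
    # The first bar has no previous bar, so it is never counted as a cross.
    was_above = k_above_d.shift(1, fill_value=True)
    kd_cross = k_above_d & ~was_above
    ema_stack = (
        (df["Close"] > df["EMA_8"])
        & (df["EMA_8"] > df["EMA_14"])
        & (df["EMA_14"] > df["EMA_40"])
    )
    return (kd_cross & ema_stack).astype(int)


# Hypothetical bars: only the second one has both a K/D cross and stacked EMAs.
bars = pd.DataFrame(
    {
        "Close": [100.0, 101.0, 102.0, 98.0, 99.0],
        "K": [0.10, 0.30, 0.50, 0.20, 0.50],
        "D": [0.20, 0.25, 0.35, 0.40, 0.45],
        "EMA_8": [99.0, 100.0, 101.0, 100.5, 100.0],
        "EMA_14": [98.0, 99.0, 100.0, 100.0, 99.5],
        "EMA_40": [97.0, 98.0, 99.0, 99.0, 99.0],
    }
)
signals = buy_signals(bars)
print(signals.tolist())  # [0, 1, 0, 0, 0]
```

The last bar shows a K/D cross, but its Close is below EMA_8, so no signal is raised.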
Sell signals are a little more complex, because we added some extra steps to avoid false or consecutive signals.
selldates = []
outcome = []
for i in range(len(x)):
    if x.Buy_signal.iloc[i]:
        # Levels fixed at the bar where the position is opened
        k = 1
        SL = x.SL.iloc[i]
        TP = x.TP.iloc[i]
        in_position = True
        # Walk forward bar by bar until TP or SL is hit or the data ends
        while in_position:
            if i + k == len(x):
                break
            looping_high = x.High.iloc[i + k]
            looping_low = x.Low.iloc[i + k]
            if looping_high >= TP:
                selldates.append(x.iloc[i + k].name)
                outcome.append('TP')
                in_position = False
            elif looping_low <= SL:
                selldates.append(x.iloc[i + k].name)
                outcome.append('SL')
                in_position = False
            k += 1
# We get 2 lists, selldates and outcome
print(selldates, outcome)
[Timestamp('2018-01-02 00:15:00')] ['TP']
They contain info about the sell movement, specifically the date and if it was a sell because of a Take Profit or a Stop Loss.
# Then we localize the date and make a new column named Sell_signal and fill it with 1
# Another column is made and it has the value of the outcome (TP or SL)
# x.loc[selldates, 'Sell_signal'] = 1
# x.loc[selldates, 'Outcome'] = outcome
# With this strategy we are able to buy or sell when all conditions are met.
x[['Buy_signal', 'Sell_signal', 'Outcome', 'buys', 'sells']].head(2)
| Open Time | Buy_signal | Sell_signal | Outcome | buys | sells |
|---|---|---|---|---|---|
| 2018-01-02 00:00:00 | 1 | 0 | NaN | 13490.42 | NaN |
| 2018-01-02 00:15:00 | 0 | 1 | TP | NaN | 13700.04 |
The first step is to calculate the RSI:
$$RSI_t = 100 - \frac{100}{1 + \frac{up_t(n)}{down_t(n)}}$$
Where $up_t(n)$ is the average of the price changes over the past $n$ timeframes in which the change was above 0, and $down_t(n)$ is the average (in absolute value) over the past $n$ timeframes in which the change was below 0.
Once we have the RSI, we can calculate the Stochastic RSI:
$$StochRSI_t = \frac{RSI_t - Min(RSI(n))}{Max(RSI(n)) - Min(RSI(n))}$$
Where $RSI_{t}$ = the latest RSI in the past $n$ timeframes, $Min(RSI(n))$ = the minimum RSI in the past $n$ timeframes and $Max(RSI(n))$ = the maximum RSI in the past $n$ timeframes.
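A minimal pandas sketch of the RSI and Stochastic RSI steps follows. It is a simplified illustration, not the `fn.stochrsi_k` implementation: the function names are hypothetical, simple rolling means are assumed for the up and down averages (libraries often use exponential smoothing instead), and K and D are taken as moving averages of the Stochastic RSI and of K, following the `smooth1` and `smooth2` descriptions in the help text above.

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    """RSI from simple rolling averages of gains and losses over n periods."""
    delta = close.diff()
    up = delta.clip(lower=0).rolling(n).mean()
    down = (-delta.clip(upper=0)).rolling(n).mean()
    return 100 - 100 / (1 + up / down)


def stoch_rsi(close: pd.Series, n: int = 14, smooth_k: int = 3, smooth_d: int = 3) -> pd.DataFrame:
    """Stochastic RSI with its K and D moving averages."""
    r = rsi(close, n)
    lowest = r.rolling(n).min()
    highest = r.rolling(n).max()
    stoch = (r - lowest) / (highest - lowest)
    k = stoch.rolling(smooth_k).mean()  # moving average of the Stochastic RSI
    d = k.rolling(smooth_d).mean()      # moving average of K
    return pd.DataFrame({"StochRSI": stoch, "K": k, "D": d})


# Hypothetical random-walk closes, used only to exercise the functions.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 200).cumsum())
result = stoch_rsi(close).dropna()
print(result.tail(3))
```

Since the Stochastic RSI rescales the RSI between its rolling minimum and maximum, all three columns stay between 0 and 1, which matches the range of the K and D columns in the dataset.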
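In this strategy we use 3 EMAs, and this is the process to calculate the EMA with an $n$-period lag at time $t$:
$$EMA_t = \beta \, P_t + (1 - \beta) \, EMA_{t-1}$$
Where $P_t$ is the closing price at time $t$ and the smoothing coefficient $\beta$ is usually: $\beta = \frac{2}{n+1}$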
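The EMA recursion with $\beta = \frac{2}{n+1}$ can be sketched in a few lines. The function below is illustrative only (hypothetical name, hypothetical prices) and seeds the average with the first close, which is an assumption; as the `fn.ema` help text notes, other implementations seed it with an SMA. The result is compared against the pandas `ewm` method with `adjust=False`, which applies the same recursion.

```python
import pandas as pd


def ema_recursive(close: pd.Series, n: int) -> pd.Series:
    """EMA_t = beta * P_t + (1 - beta) * EMA_{t-1}, seeded with the first close."""
    beta = 2 / (n + 1)
    values = [close.iloc[0]]
    for price in close.iloc[1:]:
        values.append(beta * price + (1 - beta) * values[-1])
    return pd.Series(values, index=close.index)


close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])  # hypothetical closes
manual = ema_recursive(close, n=3)                  # beta = 2 / (3 + 1) = 0.5
builtin = close.ewm(span=3, adjust=False).mean()

print(manual.tolist())  # [10.0, 10.5, 11.25, 11.125, 12.0625]
print((manual - builtin).abs().max())
```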
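The first thing we have to do to calculate the ATR is to get the True Range (TR), for which we can follow this formula:
$$TR_t = \max\left(H_t - L_t, \ |H_t - C_{t-1}|, \ |L_t - C_{t-1}|\right)$$
After this we can take the average:
$$ATR_t = \frac{1}{n} \sum_{i=0}^{n-1} TR_{t-i}$$
Where:
$H_t$ and $L_t$ are the high and the low of the current period, $C_{t-1}$ is the close of the previous period and $n$ is the number of periods averaged.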
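The True Range and its average can be sketched with pandas as shown below. This is a simplified illustration with hypothetical function names and a simple rolling mean; the `fn.atr` help text above lists `mamode='rma'` as its default, so its values will differ from this plain average. The sample uses the five High, Low and Close bars shown in the signal dataframe earlier.

```python
import pandas as pd


def true_range(high: pd.Series, low: pd.Series, close: pd.Series) -> pd.Series:
    """Largest of high-low, |high - previous close| and |low - previous close|."""
    prev_close = close.shift(1)
    ranges = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    )
    # The first bar has no previous close, so max() falls back to high - low.
    return ranges.max(axis=1)


def simple_atr(high: pd.Series, low: pd.Series, close: pd.Series, n: int = 14) -> pd.Series:
    """Simple n-period mean of the True Range (an assumption, see above)."""
    return true_range(high, low, close).rolling(n).mean()


high = pd.Series([13539.54, 13700.04, 13850.00, 13770.00, 13800.00])
low = pd.Series([13382.16, 13467.22, 13689.71, 13672.89, 13659.45])
close = pd.Series([13490.42, 13700.04, 13715.00, 13750.01, 13662.13])

tr = true_range(high, low, close)
atr3 = simple_atr(high, low, close, n=3)
print(tr.round(2).tolist())
print(atr3.round(2).tolist())
```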
Objective Function
The MAD we are looking to maximize is the Return (%). We did this through the Backtesting library.
To maximize the return, we also define some parameters to optimize:
First Parameter:
Name: SRSI window
Description: Indicates the number of periods (time window) the SRSI considers.
Value Type: numeric, int.
Value Range: [8,9,10,11]
Minimum Step Size: 1
Second Parameter:
Name: SRSI K
Description: Stochastic RSI K parameter, moving average.
Value Type: numeric, int.
Value Range: [2, 3, 4]
Minimum Step Size: 1
Third Parameter:
Name: SRSI D
Description: Stochastic RSI D parameter, moving average.
Value Type: numeric, int.
Value Range: [2, 3, 4]
Minimum Step Size: 1
Fourth Parameter:
Name: EMA1 length
Description: The period to calculate.
Value Type: numeric, int.
Value Range: [6,7,8,9]
Minimum Step Size: 1
Fifth Parameter:
Name: EMA2 length
Description: The period to calculate.
Value Type: numeric, int.
Value Range: [11,12,13,14]
Minimum Step Size: 1
Sixth Parameter:
Name: EMA3 length
Description: The period to calculate.
Value Type: numeric, int.
Value Range: [35,37,38,44]
Minimum Step Size: 1
Screenshot of Params:
*Due to the computational cost, we could not simply call the optimized parameters here, so we used screenshots.
Image(filename = 'files/params.png')
Image(filename = 'files/params2.png')
Search Space:
$\text{Number of parameters } (m): 4$
$\text{Possible values per parameter } (n): 4$
$\text{Search space } (n^m) = 4^4 = 256$
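As a cross-check on the size of the search space, the grid can be enumerated from the value ranges listed for the six parameters. The dictionary keys below are hypothetical labels, and searching all six parameters jointly is an assumption. Under that assumption the full grid is larger than the 256 combinations quoted above, which corresponds to four parameters with four values each.

```python
from itertools import product
from math import prod

# Value ranges as listed for each parameter; the key names are hypothetical.
param_grid = {
    "srsi_window": [8, 9, 10, 11],
    "srsi_k": [2, 3, 4],
    "srsi_d": [2, 3, 4],
    "ema1_length": [6, 7, 8, 9],
    "ema2_length": [11, 12, 13, 14],
    "ema3_length": [35, 37, 38, 44],
}

# Size of a full grid: the product of the number of values per parameter.
grid_size = prod(len(values) for values in param_grid.values())

# Explicit enumeration of every combination, e.g. to feed a backtest loop.
combinations = [dict(zip(param_grid, combo)) for combo in product(*param_grid.values())]

print(grid_size, len(combinations))  # 2304 2304
print(4 ** 4)                        # 256
```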
We obtained:
Image(filename = 'files/opt_params.png')
vz.Equity_viz(main.output_train._equity_curve['Equity'].index, main.output_train._equity_curve['Equity'], True)
vz.Equity_viz(main.output_test._equity_curve['Equity'].index, main.output_test._equity_curve['Equity'], True)
Even though we did not apply the strategy from a microstructure point of view, we did apply many things from the course, starting with the organization, the way of presenting the project and the dynamic of calculating and paying attention to small details.
At first we struggled to decide which strategy to put into practice, so we spent considerable time discussing and sharing strategies within the team. We finally found this one and chose to work on it because the metrics made sense to us, and here is why:
The SRSI is an indicator that concentrates on market momentum and, from what we read, succeeds at providing readings for overbought and oversold market conditions, while EMAs are among the most popular metrics for identifying trends. Combining both of them with the ATR, which is a volatility indicator, to calculate an appropriate TP may lead to a useful strategy.
After the backtesting and the optimization, we found that the strategy is not ready to be deployed, because the equity return [%] was below our expectations; to get a profitable strategy we must try many more metrics.
Regardless of the results, we are satisfied with the work we did, and we are glad to say that this project has encouraged us to keep learning about these topics.
[1] Technical Analysis Library in Python (ta) https://technical-analysis-library-in-python.readthedocs.io/en/latest/ta.html
[2] Plotly Documentation https://plotly.com/
[3] Average True Range https://www.tradingview.com/wiki/Average_True_Range_(ATR)
[4] Stochastic Relative Strength Index https://www.investopedia.com/terms/s/stochrsi.asp
[5] Exponential Moving Average https://www.investopedia.com/ask/answers/122314/what-exponential-moving-average-ema-formula-and-how-ema-calculated.asp